********************************************************************************
*Project: 	Richardson, Piper, and Lewis Vacancies and Performance Project
*Date: 		January 27, 2024 (September, 2023)
*Data: 		01_lpr_012824.xlsx
*Note: 		This batch file includes the commands to read in the vacancies and 
*			performance data, estimate the models in the text and then the 
*			models in the appendices.
********************************************************************************
**Before we begin in earnest, we load in some data on quarterly vacancies from
**the Federal Yellow Book, save them and save them as a STATA file for later.
**We will use this data to look at vacancies at different points to see if that
**matters for performance.


**Here we import data on agency performance and vacancies from an Excel file. 
import excel "data\01_lpr_012824.xlsx", sheet("Sheet1") firstrow


**Create some necessary variables: Vacant months, Employment in 1,000s
generate mvac= (1+Total_Vacant_Days)/30
generate dpm= directPAS*mvac
generate emp = employ16/1000
generate lnemp=ln(emp)
generate mvsq=mvac^2

**First, create some labels for the output tables from outreg. Outreg is a 
**command for easing the process of exporting regression output.

label variable mvac "Months Vacant (0-42)"
label variable cabinet "Cabinet Department (0,1)"
label variable direct "Direct PAS Leader (0,1)"
label variable dpm "Months*Direct PAS"
label variable eop "EOP (0,1)"
label variable offsec "Whole Department (0,1)"
label variable indcom "Independent Commission (0,1)"
label variable emp "Employees (1000s)"
label variable skills_mean_2014 "Workforce Skill -- Obama Administration"
label variable depprior "Priority Department (0,1)"
label variable bprior "Priority Bureau (0,1)"
label variable ideo_rating "Agency Ideology (L-C)"
label variable tott "Leadership Transitions (0-4)"
label variable Obama_Vacancy "Number of Vacant Days--Obama"
label variable Bush_Vacancy_Length_Total "Number of Vacant Days--Bush"
label variable lnemp "Ln(Employment in 1,000s)"
label variable mvsq "Months Vacant^2"

*******************************INFERENCE PART***********************************
**Here we try and see if vacancies are correlated with skills, agency ideology, and presidential priority.
**This will guide efforts to further evaluate whether vacancies affect different subsets of agencies differently.
**Inference models

reg mvac skills_mean_2014 directPAS offsec eop cabinet indcom depp bprior ideo_rating emp, cluster(dept)
outreg using p_l_table1.docx , replace se bdec(3) starlevels(10 5) varlabels squarebrack title (Models of PAS Position Vacancy Length, 2017-2020) 

********************************MAIN ANALYSIS********************************************************************************************************************************************
**We reference raw correlations in the section where we introduce vacancy length.

corr mvac hier

**This the main analysis for the paper. Here are the models regressing agency performance on vacancies.
**TABLE 1
reg hier mvac directPAS tott   eop cabinet offsec indcom depp skills_mean_2014 ideo_rating, cluster(dept)
display "adjusted R2 = " e(r2_a)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels ctitles ("Agency Performance (-2.5, 2.5)" "All Agencies")

reg hier mvac mvsq directPAS tott   eop cabinet offsec indcom depp skills_mean_2014 ideo_rating, cluster(dept)
display "adjusted R2 = " e(r2_a)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels merge ctitles ("Agency Performance (-2.5, 2.5)" "All Agencies")
 
reg hier mvac tott   eop cabinet offsec indcom depp skills_mean_2014 ideo_rating if directPAS==1, cluster(dept)
display "adjusted R2 = " e(r2_a)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels merge ctitles ("Agency Performance (-2.5, 2.5)" "PAS Head Only")

reg hier mvac mvsq tott   eop cabinet offsec indcom depp skills_mean_2014 ideo_rating if directPAS==1, cluster(dept)
display "adjusted R2 = " e(r2_a)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels merge ctitles ("Agency Performance (-2.5, 2.5)" "PAS Head Only")

**Regression Diagnostics

reg hier mvac directPAS tott eop cabinet offsec indcom depp skills_mean_2014 ideo_rating
hettest
lvr2plot, mlabel(bureau_acr)
predict lev, leverage
predict r, resid
avplots

**Generally, a point > (2*#predictors+2)/N should be looked at carefully. In this
**model this would be a point > 22/125=0.18

list bureau_acr lev if lev>0.18&lev~=.

**There are 5 agencies that fit this category: CEQ, CEA, CPSC, FERC, OMB, and ONDCP
**I will reestimate the model without these cases.

reg hier mvac directPAS tott   eop cabinet offsec indcom depp skills_mean_2014 ideo_rating if bureau_acr~="CEQ"&bureau_acr~="CEA"&bureau_acr~="CPSC"&bureau_acr~="FERC"&bureau_acr~="OMB"&bureau_acr~="ONDCP", cluster(dept)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels ctitles ("Agency Performance (-2.5, 2.5)" "All Agencies")

reg hier mvac mvsq directPAS tott   eop cabinet offsec indcom depp skills_mean_2014 ideo_rating if bureau_acr~="CEQ"&bureau_acr~="CEA"&bureau_acr~="CPSC"&bureau_acr~="FERC"&bureau_acr~="OMB"&bureau_acr~="ONDCP", cluster(dept)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels merge ctitles ("Agency Performance (-2.5, 2.5)" "All Agencies")
 
reg hier mvac directPAS tott   eop cabinet offsec indcom depp skills_mean_2014 ideo_rating if directPAS==1& bureau_acr~="CEQ"&bureau_acr~="CEA"&bureau_acr~="CPSC"&bureau_acr~="FERC"&bureau_acr~="OMB"&bureau_acr~="ONDCP", cluster(dept)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels merge ctitles ("Agency Performance (-2.5, 2.5)" "PAS Head Only")

reg hier mvac mvsq directPAS tott   eop cabinet offsec indcom depp skills_mean_2014 ideo_rating if directPAS==1& bureau_acr~="CEQ"&bureau_acr~="CEA"&bureau_acr~="CPSC"&bureau_acr~="FERC"&bureau_acr~="OMB"&bureau_acr~="ONDCP", cluster(dept)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels merge ctitles ("Agency Performance (-2.5, 2.5)" "PAS Head Only")

**We have also looked at the added variable plots and they look OK.

**Now residuals

kdensity r, normal
rvfplot, yline(0)

 **Table 2. See 04_table_2.R and ReadMe file
 
 **Figures 1-4 created using R. See those files
 
 **Table 3. Excluding cases where Democrats and Republicans disagree
generate rate_diff=0 if hier~=.
replace rate_diff=1 if sig_dif_90==1
 
reg hier mvac directPAS tott   eop cabinet offsec indcom depp skills_mean_2014 ideo_rating if rate_diff==0, cluster(dept)
display "adjusted R2 = " e(r2_a)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels ctitles ("Agency Performance (-1.96, 1.41)" "All Agencies")

reg hier mvac mvsq directPAS tott   eop cabinet offsec indcom depp skills_mean_2014 ideo_rating if rate_diff==0, cluster(dept)
display "adjusted R2 = " e(r2_a)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels merge ctitles ("Agency Performance (-1.96, 1.41)" "All Agencies")
 
reg hier mvac tott   eop cabinet offsec indcom depp skills_mean_2014 ideo_rating if directPAS==1 & rate_diff==0, cluster(dept)
display "adjusted R2 = " e(r2_a)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels merge ctitles ("Agency Performance (-1.96, 1.41)" "PAS Head Only")

reg hier mvac mvsq tott   eop cabinet offsec indcom depp skills_mean_2014 ideo_rating if directPAS==1&rate_diff==0, cluster(dept)
display "adjusted R2 = " e(r2_a)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels merge ctitles ("Agency Performance (-1.96, 1.41)" "PAS Head Only")
  
**********APPENDICES************************************************************
 
 **Appendix A includes estimation details and partisan non-response. See A1_pty_non_response.R
 
 **Appendix B includes comparison of hierarchical and non-hierarchical models, informed vs. naive priors. See B1_performance_ratings_comp.R
 
 **Appendix C includes the numerical estimates themselves. See Appendix A and ReadMe file for details.
 
 **Appendix D includes the validation of the measures. See D1_validation_figure.R
 
 **External Validation of Measure Using GAO Recommendations Database. These are the commands for Tables D1 and D2 in Appendix D.

joinby bureau_acr using "data\03_gao_recs_replication.dta", unmatched(master)

**2020 Analysis
label variable gao_pri_rec_2020 "GAO Open Priority Recommendations (2020)"
label variable gao_all_rec_2020 "GAO Open Recommendations (2020)"
label variable gao_pri_rec_2021 "GAO Open Priority Recommendations(2021)"
label variable hier "Performance Rating"

nbreg gao_pri_rec_2020 hier lnemp , cluster(dept_acr)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels ctitles ("GAO Open Priority Recommendations (2020)")
mfx

nbreg gao_all_rec_2020 hier lnemp, cluster(dept_acr)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels ctitles ("GAO Open Recommendations")
mfx

**Only Cases Where Rs and Ds Agree
nbreg  gao_pri_rec_2020 hier lnemp if sig_dif_90~=1, cluster(dept_acr)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels ctitles ("GAO Open Priority Recommendations (2020)")
mfx

nbreg  gao_all_rec_2020 hier lnemp if sig_dif_90~=1, cluster(dept_acr)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels ctitles ("GAO Open Recommendations")
mfx

**2021 Analysis
nbreg  gao_pri_rec_2021 hier lnemp , cluster(dept_acr)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels ctitles ("GAO Open Priority Recommendations (2021)")
mfx

nbreg  gao_all_rec_2021 hier lnemp , cluster(dept_acr)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels ctitles ("GAO Open Recommendations")
mfx

nbreg  gao_pri_rec_2021 hier lnemp if sig_dif_90~=1 , cluster(dept_acr)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels ctitles ("GAO Open Priority Recommendations (2021)")
mfx

nbreg  gao_all_rec_2021 hier lnemp if sig_dif_90~=1, cluster(dept_acr)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels ctitles ("GAO Open Recommendations")
mfx

**Appendix E is about collecting data on vacancies and turnover and has no replication.

**Appendix F is an appendix with alternate specifications and sensitivity analysis. Except for the models of hierarchy (Table F3) in this appendix, 
**See F_ovb_sensitivity.R for other analysis

** The Effect of Higher Level Vacancies on Federal Agency Performance

generate numvac=next_up_vacant_days/30
generate mvnumv = mvac  *  numvac

label variable numvac "Months Vacant Higher Position"
label variable mvnumv "Months Vacant*Months Vacant Higher Position"

reg hier mvac tott depp skills_mean_2014 ideo_rating numvac, cluster(dept)
display "adjusted R2 = " e(r2_a)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels ctitles ("Agency Performance (-2.5, 2.5)" "Agencies with appointee layers")

reg hier mvac tott depp skills_mean_2014 ideo_rating numvac mvnumv, cluster(dept)
display "adjusted R2 = " e(r2_a)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels merge ctitles ("Agency Performance (-2.5, 2.5)" "Agencies with appointee layers")


**Appendix G is an appendix with different specifications related to the timing of vacancies. Here are the commands for this analysis

generate mcon = Days_2_Con/30
generate mcsq=mcon^2

label variable mcon "Months to Initial Confirmation (0-42)"
label variable mcsq "Months to Initial Confirmation^2"

** Table G1 - Here are regressions substituting days to confirmation with vacancies.

reg hier mcon directPAS tott eop cabinet offsec indcom depp skills_mean_2014 ideo_rating, cluster(dept)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels ctitles ("Agency Performance (-2.5, 2.5)" "All Agencies")

reg hier mcon mcsq directPAS tott eop cabinet offsec indcom depp skills_mean_2014 ideo_rating, cluster(dept)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels merge ctitles ("Agency Performance (-2.5, 2.5)" "All Agencies")
 
reg hier mcon directPAS tott eop cabinet offsec indcom depp skills_mean_2014 ideo_rating if directPAS==1, cluster(dept)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels merge ctitles ("Agency Performance (-2.5, 2.5)" "PAS Head Only")

reg hier mcon mcsq directPAS tott eop cabinet offsec indcom depp skills_mean_2014 ideo_rating if directPAS==1, cluster(dept)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels merge ctitles ("Agency Performance (-2.5, 2.5)" "PAS Head Only")

**Table G2 - Now I create a variable which is the number of vacant months excluding the initial vacancy.

generate revv = mvac - mcon
generate revsq=revv^2

label variable revv "Vacant Months Other than 1st Vacancy (0-28)"
label variable revsq "Vacant Months Other than 1st Vacancy^2"

reg hier mcon revv directPAS tott eop cabinet offsec indcom depp skills_mean_2014 ideo_rating, cluster(dept)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels ctitles ("Agency Performance (-2.5, 2.5)" "All Agencies")

reg hier mcon mcsq revv revsq directPAS tott eop cabinet offsec indcom depp skills_mean_2014 ideo_rating, cluster(dept)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels merge ctitles ("Agency Performance (-2.5, 2.5)" "All Agencies")

reg hier mcon revv tott eop cabinet offsec indcom depp skills_mean_2014 ideo_rating if directPAS==1, cluster(dept)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels merge ctitles ("Agency Performance (-2.5, 2.5)" "PAS Head Only")

reg hier mcon mcsq revv revsq tott eop cabinet offsec indcom depp skills_mean_2014 ideo_rating if directPAS==1, cluster(dept)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels merge ctitles ("Agency Performance (-2.5, 2.5)" "PAS Head Only")

**Now I add in data from Federal Yellow Book on which quarters were vacant from 2017-2020.

joinby bureau_acr using "data\02_q_vacancies_data_trump.dta", unmatched(master) _merge(merge_2)

**The following commands create 15 indicator (0,1) variables for whether a position
**was vacant during that quarter.

generate vacant1=1
replace vacant1=0 if winter_17_status=="confirmed"

generate vacant2=1
replace vacant2=0 if spring_17_status=="confirmed"

generate vacant3=1
replace vacant3=0 if summer_17_status=="confirmed"

generate vacant4=1
replace vacant4=0 if fall_17_status=="confirmed"

generate vacant5=1
replace vacant5=0 if winter_18_status=="confirmed"

generate vacant6=1
replace vacant6=0 if spring_18_status=="confirmed"

generate vacant7=1
replace vacant7=0 if summer_18_status=="confirmed"

generate vacant8=1
replace vacant8=0 if fall_18_status=="confirmed"

generate vacant9=1
replace vacant9=0 if winter_19_status=="confirmed"

generate vacant10=1
replace vacant10=0 if spring_19_status=="confirmed"

generate vacant11=1
replace vacant11=0 if summer_19_status=="confirmed"

generate vacant12=1
replace vacant12=0 if fall_19_status=="confirmed"

generate vacant13=1
replace vacant13=0 if winter_20_status=="confirmed"

generate vacant14=1
replace vacant14=0 if spring_20_status=="confirmed"

generate vacant15=1
replace vacant15=0 if summer_20_status=="confirmed"

label variable vacant1 "Winter 2017"
label variable vacant2 "Spring 2017"
label variable vacant3 "Summer 2017"
label variable vacant4 "Fall 2017"

label variable vacant5 "Winter 2018"
label variable vacant6 "Spring 2018"
label variable vacant7 "Summer 2018"
label variable vacant8 "Fall 2018"

label variable vacant9 "Winter 2019"
label variable vacant10 "Spring 2019"
label variable vacant11"Summer 2019"
label variable vacant12"Fall 2019"

label variable vacant13 "Winter 2020"
label variable vacant14 "Spring 2020"
label variable vacant15"Summer 2020"

*Table G3

reg hier vacant* directPAS tott eop cabinet offsec indcom depp skills_mean_2014 ideo_rating, cluster(dept)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels ctitles ("Agency Performance (-2.5, 2.5)" "All Agencies")

reg hier vacant1 vacant2 vacant3 vacant4 directPAS tott eop cabinet offsec indcom depp skills_mean_2014 ideo_rating, cluster(dept)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels merge ctitles ("Agency Performance (-2.5, 2.5)" "All Agencies")

reg hier vacant5 vacant6 vacant7 vacant8 directPAS tott eop cabinet offsec indcom depp skills_mean_2014 ideo_rating, cluster(dept)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels merge ctitles ("Agency Performance (-2.5, 2.5)" "All Agencies")

reg hier vacant9 vacant10 vacant11 vacant12 directPAS tott eop cabinet offsec indcom depp skills_mean_2014 ideo_rating, cluster(dept)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels merge ctitles ("Agency Performance (-2.5, 2.5)" "All Agencies")

reg hier vacant13 vacant14 vacant15 directPAS tott eop cabinet offsec indcom depp skills_mean_2014 ideo_rating, cluster(dept)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels merge ctitles ("Agency Performance (-2.5, 2.5)" "All Agencies")

**Appendix H - see H1_pty_dis_figure.R

 **Appendix I. Democratic and Republican Ratings Separate. This gets complicated. There are a small number of ratings which makes
 **model estimation complicated. There are few degrees of freedom. It is also the case that with the D, R measures
 **some ratings are based upon partisanship, not performance. If we exclude those ratings, we estimate models on 
 **partisan ratings but only those likely to be related to performance. The Ns here are quite small.
 
* Table I1
reg perf_rating_dem mvac eop cabinet offsec if rate_diff==0, cluster(dept)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels ctitles ("Agency Performance (-2.5, 2.5)" "All Agencies")

reg perf_rating_rep mvac eop cabinet offsec if rate_diff==0, cluster(dept)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels merge ctitles ("Agency Performance (-2.5, 2.5)" "All Agencies")
 
reg perf_rating_dem mvac eop cabinet offsec skills_mean_2014 if rate_diff==0, cluster(dept)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels merge ctitles ("Agency Performance (-2.5, 2.5)" "All Agencies")

reg perf_rating_rep mvac eop cabinet offsec skills_mean_2014 if rate_diff==0, cluster(dept)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels merge ctitles ("Agency Performance (-2.5, 2.5)" "All Agencies")
 

**Auxiliary Analysis

 **Footnote 6: Omitting Independent Commissions
reg hier mvac directPAS tott eop cabinet offsec depp skills_mean_2014 ideo_rating if indcom==0, cluster(dept)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels ctitles ("Agency Performance (-2.5, 2.5)" "All Agencies")
 
reg hier mvac directPAS tott eop cabinet offsec depp skills_mean_2014 ideo_rating if directPAS==1&indcom==0, cluster(dept)
outreg, replace se bdec(3) squarebrack starlevels(10 5) varlabels merge ctitles ("Agency Performance (-2.5, 2.5)" "PAS Head Only")

**Priority and Vacancy Length. These commands test whether vacancy lenghts are 
**systematically different for priority agencies or conservative agencies.
ttest mvac, by(depp)
summarize ideo_rating, detail
generate conservative=0 if ideo_rating~=.
replace conservative =1 if ideo_rating>=0.0116309

ttest mvac, by(conservative)
